Overview
Dataset statistics
| Number of variables | 19 |
|---|---|
| Number of observations | 3066766 |
| Missing cells | 358715 |
| Missing cells (%) | 0.6% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 565.6 MiB |
| Average record size in memory | 193.4 B |
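The derived figures in this table can be cross-checked from the raw counts. The sketch below assumes that the missing-cell percentage is taken over all cells (variables × observations) and that the average record size is the total memory divided by the number of observations; neither definition is stated in the report, but both reproduce the values shown.

```python
# Cross-check of the derived dataset statistics reported above.
n_variables = 19
n_observations = 3_066_766
missing_cells = 358_715
total_memory_mib = 565.6

# Assumption: the percentage is taken over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 MiB = 1024 * 1024 bytes.
avg_record_bytes = total_memory_mib * 1024 * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

The missing-cell count is also consistent with the per-variable figures: five variables each report 71743 missing values, and 5 × 71743 = 358715.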
Variable types
| Categorical | 5 |
|---|---|
| DateTime | 2 |
| Numeric | 11 |
| Boolean | 1 |
Alerts
| Alert | Type |
|---|---|
| RatecodeID is highly overall correlated with tolls_amount | High correlation |
| VendorID is highly overall correlated with extra | High correlation |
| congestion_surcharge is highly overall correlated with improvement_surcharge | High correlation |
| extra is highly overall correlated with VendorID | High correlation |
| fare_amount is highly overall correlated with total_amount and 1 other fields | High correlation |
| improvement_surcharge is highly overall correlated with congestion_surcharge | High correlation |
| tip_amount is highly overall correlated with total_amount | High correlation |
| tolls_amount is highly overall correlated with RatecodeID | High correlation |
| total_amount is highly overall correlated with fare_amount and 2 other fields | High correlation |
| trip_distance is highly overall correlated with fare_amount and 1 other fields | High correlation |
| store_and_fwd_flag is highly imbalanced (94.2%) | Imbalance |
| payment_type is highly imbalanced (59.0%) | Imbalance |
| improvement_surcharge is highly imbalanced (96.1%) | Imbalance |
| congestion_surcharge is highly imbalanced (71.7%) | Imbalance |
| airport_fee is highly imbalanced (72.2%) | Imbalance |
| passenger_count has 71743 (2.3%) missing values | Missing |
| RatecodeID has 71743 (2.3%) missing values | Missing |
| store_and_fwd_flag has 71743 (2.3%) missing values | Missing |
| congestion_surcharge has 71743 (2.3%) missing values | Missing |
| airport_fee has 71743 (2.3%) missing values | Missing |
| trip_distance is highly skewed (γ₁ = 810.4075091) | Skewed |
| mta_tax is highly skewed (γ₁ = 35.31543467) | Skewed |
| passenger_count has 51164 (1.7%) zeros | Zeros |
| trip_distance has 45862 (1.5%) zeros | Zeros |
| extra has 1240718 (40.5%) zeros | Zeros |
| tip_amount has 694757 (22.7%) zeros | Zeros |
| tolls_amount has 2840307 (92.6%) zeros | Zeros |
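The imbalance percentages in the alerts above can be approximated from the category counts. The sketch below assumes the score is one minus the Shannon entropy of the non-missing category frequencies, normalized by log2 of the number of categories. The report does not state this formula, and `imbalance_score` is a hypothetical helper name, but the assumption reproduces the reported 94.2% and 59.0%.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k) for a list of k category counts (assumed formula)."""
    if len(counts) < 2:
        return 0.0  # a single category has no defined normalized entropy
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy / math.log2(len(counts))


# Non-missing counts copied from the Common Values tables in this report.
store_and_fwd_flag = [2_975_020, 20_003]
payment_type = [2_411_462, 532_241, 71_743, 33_297, 18_023]

print(f"store_and_fwd_flag: {imbalance_score(store_and_fwd_flag):.1%}")
print(f"payment_type: {imbalance_score(payment_type):.1%}")
```

A score near 100% means almost all rows fall into one category; a perfectly uniform distribution would score 0%.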
Reproduction
| Analysis started | 2026-01-08 22:53:18.550213 |
|---|---|
| Analysis finished | 2026-01-08 22:57:00.277294 |
| Duration | 3 minutes and 41.73 seconds |
| Software version | ydata-profiling v4.18.0 |
| Download configuration | config.json |
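The reported duration follows directly from the two timestamps. A minimal sketch, assuming both timestamps share the same time zone:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2026-01-08 22:53:18.550213", FMT)
finished = datetime.strptime("2026-01-08 22:57:00.277294", FMT)

# Split the elapsed time into whole minutes and remaining seconds.
elapsed = finished - started
minutes, seconds = divmod(elapsed.total_seconds(), 60)
print(f"{int(minutes)} minutes and {seconds:.2f} seconds")
```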
Variables
VendorID
Categorical
High correlation
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 146.2 MiB |
| 2 | 2239399 |
|---|---|
| 1 | 827367 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 2 |
|---|---|
| 2nd row | 2 |
| 3rd row | 2 |
| 4th row | 1 |
| 5th row | 2 |
Common Values
| Value | Count | Frequency (%) |
| 2 | 2239399 | 73.0% |
| 1 | 827367 | 27.0% |
Common Values (Plot)
| Value | Count | Frequency (%) |
| 2 | 2239399 | |
| 1 | 827367 | 27.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| 2 | 2239399 | |
| 1 | 827367 | 27.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 3066766 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 2 | 2239399 | |
| 1 | 827367 | 27.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 3066766 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 2 | 2239399 | |
| 1 | 827367 | 27.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 3066766 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 2 | 2239399 | |
| 1 | 827367 | 27.0% |
tpep_pickup_datetime
Date
| Distinct | 1610975 |
|---|---|
| Distinct (%) | 52.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 23.4 MiB |
| Minimum | 2008-12-31 23:01:42 |
|---|---|
| Maximum | 2023-02-01 00:56:53 |
| Invalid dates | 0 |
| Invalid dates (%) | 0.0% |
tpep_dropoff_datetime
Date
| Distinct | 1611319 |
|---|---|
| Distinct (%) | 52.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 23.4 MiB |
| Minimum | 2009-01-01 14:29:11 |
|---|---|
| Maximum | 2023-02-02 09:28:47 |
| Invalid dates | 0 |
| Invalid dates (%) | 0.0% |
passenger_count
Real number (ℝ)
Missing, Zeros
| Distinct | 10 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 71743 |
| Missing (%) | 2.3% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.3625321 |
| Minimum | 0 |
|---|---|
| Maximum | 9 |
| Zeros | 51164 |
| Zeros (%) | 1.7% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 23.4 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 1 |
| Q3 | 1 |
| 95-th percentile | 3 |
| Maximum | 9 |
| Range | 9 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.89611997 |
|---|---|
| Coefficient of variation (CV) | 0.65768724 |
| Kurtosis | 9.5142525 |
| Mean | 1.3625321 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 2.8753862 |
| Sum | 4080815 |
| Variance | 0.80303101 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 2261400 | 73.7% |
| 2 | 451536 | 14.7% |
| 3 | 106353 | 3.5% |
| 4 | 53745 | 1.8% |
| 0 | 51164 | 1.7% |
| 5 | 42681 | 1.4% |
| 6 | 28124 | 0.9% |
| 8 | 13 | < 0.1% |
| 7 | 6 | < 0.1% |
| 9 | 1 | < 0.1% |
| (Missing) | 71743 | 2.3% |
| Value | Count | Frequency (%) |
| 0 | 51164 | 1.7% |
| 1 | 2261400 | |
| 2 | 451536 | 14.7% |
| 3 | 106353 | 3.5% |
| 4 | 53745 | 1.8% |
| 5 | 42681 | 1.4% |
| 6 | 28124 | 0.9% |
| 7 | 6 | < 0.1% |
| 8 | 13 | < 0.1% |
| 9 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 9 | 1 | < 0.1% |
| 8 | 13 | < 0.1% |
| 7 | 6 | < 0.1% |
| 6 | 28124 | 0.9% |
| 5 | 42681 | 1.4% |
| 4 | 53745 | 1.8% |
| 3 | 106353 | 3.5% |
| 2 | 451536 | 14.7% |
| 1 | 2261400 | |
| 0 | 51164 | 1.7% |
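Because passenger_count takes only ten distinct values, its descriptive statistics can be recomputed from the value counts alone. The sketch below assumes the variance uses the sample (n − 1) denominator and that missing rows are excluded; under those assumptions it agrees with the reported sum, mean, variance, and coefficient of variation to the precision shown.

```python
import math

# Value counts for passenger_count (non-missing rows), copied from the tables above.
counts = {0: 51_164, 1: 2_261_400, 2: 451_536, 3: 106_353, 4: 53_745,
          5: 42_681, 6: 28_124, 7: 6, 8: 13, 9: 1}

n = sum(counts.values())
total = sum(value * count for value, count in counts.items())
mean = total / n

# Assumption: sample variance with n - 1 in the denominator.
sum_sq_dev = sum(count * (value - mean) ** 2 for value, count in counts.items())
variance = sum_sq_dev / (n - 1)
std = math.sqrt(variance)
cv = std / mean  # coefficient of variation

print(f"n={n} sum={total} mean={mean:.7f}")
print(f"variance={variance:.6f} std={std:.6f} cv={cv:.6f}")
```

The non-missing row count obtained this way, 2995023, equals the 3066766 observations minus the 71743 missing values.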
trip_distance
Real number (ℝ)
High correlation, Skewed, Zeros
| Distinct | 4387 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.847342 |
| Minimum | 0 |
|---|---|
| Maximum | 258928.15 |
| Zeros | 45862 |
| Zeros (%) | 1.5% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 23.4 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0.5 |
| Q1 | 1.06 |
| median | 1.8 |
| Q3 | 3.33 |
| 95-th percentile | 14.32 |
| Maximum | 258928.15 |
| Range | 258928.15 |
| Interquartile range (IQR) | 2.27 |
Descriptive statistics
| Standard deviation | 249.58376 |
|---|---|
| Coefficient of variation (CV) | 64.871736 |
| Kurtosis | 726436.93 |
| Mean | 3.847342 |
| Median Absolute Deviation (MAD) | 0.9 |
| Skewness | 810.40751 |
| Sum | 11798898 |
| Variance | 62292.051 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 45862 | 1.5% |
| 1 | 43827 | 1.4% |
| 0.9 | 43473 | 1.4% |
| 1.1 | 42578 | 1.4% |
| 0.8 | 41801 | 1.4% |
| 1.2 | 41147 | 1.3% |
| 1.3 | 39793 | 1.3% |
| 0.7 | 38108 | 1.2% |
| 1.4 | 37286 | 1.2% |
| 1.5 | 35544 | 1.2% |
| Other values (4377) | 2657347 | 86.6% |
| Value | Count | Frequency (%) |
| 0 | 45862 | |
| 0.01 | 2137 | 0.1% |
| 0.02 | 1419 | < 0.1% |
| 0.03 | 1156 | < 0.1% |
| 0.04 | 862 | < 0.1% |
| 0.05 | 697 | < 0.1% |
| 0.06 | 565 | < 0.1% |
| 0.07 | 582 | < 0.1% |
| 0.08 | 471 | < 0.1% |
| 0.09 | 448 | < 0.1% |
| Value | Count | Frequency (%) |
| 258928.15 | 1 | |
| 225987.37 | 1 | |
| 187872.33 | 1 | |
| 116439.71 | 1 | |
| 85543.66 | 1 | |
| 76886.52 | 1 | |
| 62359.52 | 1 | |
| 52042.3 | 1 | |
| 33205.32 | 1 | |
| 16562.61 | 1 |
RatecodeID
Real number (ℝ)
High correlation, Missing
| Distinct | 7 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 71743 |
| Missing (%) | 2.3% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.4974396 |
| Minimum | 1 |
|---|---|
| Maximum | 99 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 23.4 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 1 |
| Q3 | 1 |
| 95-th percentile | 2 |
| Maximum | 99 |
| Range | 98 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 6.4747667 |
|---|---|
| Coefficient of variation (CV) | 4.3238918 |
| Kurtosis | 222.02956 |
| Mean | 1.4974396 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 14.943792 |
| Sum | 4484866 |
| Variance | 41.922604 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 2839305 | 92.6% |
| 2 | 114239 | 3.7% |
| 5 | 15043 | 0.5% |
| 99 | 13106 | 0.4% |
| 3 | 8958 | 0.3% |
| 4 | 4366 | 0.1% |
| 6 | 6 | < 0.1% |
| (Missing) | 71743 | 2.3% |
| Value | Count | Frequency (%) |
| 1 | 2839305 | |
| 2 | 114239 | 3.7% |
| 3 | 8958 | 0.3% |
| 4 | 4366 | 0.1% |
| 5 | 15043 | 0.5% |
| 6 | 6 | < 0.1% |
| 99 | 13106 | 0.4% |
| Value | Count | Frequency (%) |
| 99 | 13106 | 0.4% |
| 6 | 6 | < 0.1% |
| 5 | 15043 | 0.5% |
| 4 | 4366 | 0.1% |
| 3 | 8958 | 0.3% |
| 2 | 114239 | 3.7% |
| 1 | 2839305 |
store_and_fwd_flag
Boolean
Imbalance, Missing
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 71743 |
| Missing (%) | 2.3% |
| Memory size | 5.8 MiB |
| False | 2975020 |
|---|---|
| True | 20003 |
| (Missing) | 71743 |
| Value | Count | Frequency (%) |
| False | 2975020 | 97.0% |
| True | 20003 | 0.7% |
| (Missing) | 71743 | 2.3% |
PULocationID
Real number (ℝ)
| Distinct | 257 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 166.39805 |
| Minimum | 1 |
|---|---|
| Maximum | 265 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 23.4 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 48 |
| Q1 | 132 |
| median | 162 |
| Q3 | 234 |
| 95-th percentile | 261 |
| Maximum | 265 |
| Range | 264 |
| Interquartile range (IQR) | 102 |
Descriptive statistics
| Standard deviation | 64.244131 |
|---|---|
| Coefficient of variation (CV) | 0.38608705 |
| Kurtosis | -0.86450402 |
| Mean | 166.39805 |
| Median Absolute Deviation (MAD) | 62 |
| Skewness | -0.25597836 |
| Sum | 5.1030387 × 10⁸ |
| Variance | 4127.3083 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 132 | 160030 | 5.2% |
| 237 | 148074 | 4.8% |
| 236 | 138391 | 4.5% |
| 161 | 135417 | 4.4% |
| 186 | 109227 | 3.6% |
| 162 | 105334 | 3.4% |
| 142 | 100228 | 3.3% |
| 230 | 98991 | 3.2% |
| 138 | 89188 | 2.9% |
| 170 | 88346 | 2.9% |
| Other values (247) | 1893540 | 61.7% |
| Value | Count | Frequency (%) |
| 1 | 410 | < 0.1% |
| 2 | 2 | < 0.1% |
| 3 | 39 | < 0.1% |
| 4 | 3649 | |
| 5 | 56 | < 0.1% |
| 6 | 48 | < 0.1% |
| 7 | 1510 | |
| 8 | 11 | < 0.1% |
| 9 | 44 | < 0.1% |
| 10 | 1356 | < 0.1% |
| Value | Count | Frequency (%) |
| 265 | 1647 | 0.1% |
| 264 | 40116 | |
| 263 | 66128 | |
| 262 | 43760 | |
| 261 | 12842 | 0.4% |
| 260 | 640 | < 0.1% |
| 259 | 74 | < 0.1% |
| 258 | 70 | < 0.1% |
| 257 | 69 | < 0.1% |
| 256 | 967 | < 0.1% |
DOLocationID
Real number (ℝ)
| Distinct | 261 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 164.39263 |
| Minimum | 1 |
|---|---|
| Maximum | 265 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 23.4 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 43 |
| Q1 | 114 |
| median | 162 |
| Q3 | 234 |
| 95-th percentile | 262 |
| Maximum | 265 |
| Range | 264 |
| Interquartile range (IQR) | 120 |
Descriptive statistics
| Standard deviation | 69.943682 |
|---|---|
| Coefficient of variation (CV) | 0.42546726 |
| Kurtosis | -0.92636707 |
| Mean | 164.39263 |
| Median Absolute Deviation (MAD) | 69 |
| Skewness | -0.36632366 |
| Sum | 5.0415373 × 10⁸ |
| Variance | 4892.1186 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 236 | 146348 | 4.8% |
| 237 | 132364 | 4.3% |
| 161 | 116149 | 3.8% |
| 230 | 89878 | 2.9% |
| 170 | 88783 | 2.9% |
| 239 | 87969 | 2.9% |
| 142 | 87969 | 2.9% |
| 141 | 87655 | 2.9% |
| 162 | 82739 | 2.7% |
| 48 | 77383 | 2.5% |
| Other values (251) | 2069529 | 67.5% |
| Value | Count | Frequency (%) |
| 1 | 7526 | |
| 2 | 23 | < 0.1% |
| 3 | 198 | < 0.1% |
| 4 | 12165 | |
| 5 | 56 | < 0.1% |
| 6 | 82 | < 0.1% |
| 7 | 9434 | |
| 8 | 45 | < 0.1% |
| 9 | 259 | < 0.1% |
| 10 | 4210 | 0.1% |
| Value | Count | Frequency (%) |
| 265 | 10958 | 0.4% |
| 264 | 22591 | 0.7% |
| 263 | 69319 | |
| 262 | 51502 | |
| 261 | 12427 | 0.4% |
| 260 | 2343 | 0.1% |
| 259 | 366 | < 0.1% |
| 258 | 652 | < 0.1% |
| 257 | 1238 | < 0.1% |
| 256 | 6842 | 0.2% |
payment_type
Categorical
Imbalance
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 146.2 MiB |
| 1 | 2411462 |
|---|---|
| 2 | 532241 |
| 0 | 71743 |
| 4 | 33297 |
| 3 | 18023 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 2 |
|---|---|
| 2nd row | 1 |
| 3rd row | 1 |
| 4th row | 1 |
| 5th row | 1 |
Common Values
| Value | Count | Frequency (%) |
| 1 | 2411462 | 78.6% |
| 2 | 532241 | 17.4% |
| 0 | 71743 | 2.3% |
| 4 | 33297 | 1.1% |
| 3 | 18023 | 0.6% |
Common Values (Plot)
| Value | Count | Frequency (%) |
| 1 | 2411462 | |
| 2 | 532241 | 17.4% |
| 0 | 71743 | 2.3% |
| 4 | 33297 | 1.1% |
| 3 | 18023 | 0.6% |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 2411462 | |
| 2 | 532241 | 17.4% |
| 0 | 71743 | 2.3% |
| 4 | 33297 | 1.1% |
| 3 | 18023 | 0.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 3066766 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 2411462 | |
| 2 | 532241 | 17.4% |
| 0 | 71743 | 2.3% |
| 4 | 33297 | 1.1% |
| 3 | 18023 | 0.6% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 3066766 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 1 | 2411462 | |
| 2 | 532241 | 17.4% |
| 0 | 71743 | 2.3% |
| 4 | 33297 | 1.1% |
| 3 | 18023 | 0.6% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 3066766 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 1 | 2411462 | |
| 2 | 532241 | 17.4% |
| 0 | 71743 | 2.3% |
| 4 | 33297 | 1.1% |
| 3 | 18023 | 0.6% |
fare_amount
Real number (ℝ)
High correlation
| Distinct | 6873 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 18.367069 |
| Minimum | -900 |
|---|---|
| Maximum | 1160.1 |
| Zeros | 1110 |
| Zeros (%) | < 0.1% |
| Negative | 25049 |
| Negative (%) | 0.8% |
| Memory size | 23.4 MiB |
Quantile statistics
| Minimum | -900 |
|---|---|
| 5-th percentile | 5.8 |
| Q1 | 8.6 |
| median | 12.8 |
| Q3 | 20.5 |
| 95-th percentile | 65.3 |
| Maximum | 1160.1 |
| Range | 2060.1 |
| Interquartile range (IQR) | 11.9 |
Descriptive statistics
| Standard deviation | 17.807822 |
|---|---|
| Coefficient of variation (CV) | 0.96955166 |
| Kurtosis | 49.554467 |
| Mean | 18.367069 |
| Median Absolute Deviation (MAD) | 4.9 |
| Skewness | 3.2212102 |
| Sum | 56327502 |
| Variance | 317.11852 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 8.6 | 149461 | 4.9% |
| 9.3 | 146821 | 4.8% |
| 7.9 | 146075 | 4.8% |
| 10 | 143521 | 4.7% |
| 7.2 | 139156 | 4.5% |
| 10.7 | 135232 | 4.4% |
| 11.4 | 128910 | 4.2% |
| 6.5 | 122739 | 4.0% |
| 12.1 | 120559 | 3.9% |
| 70 | 113028 | 3.7% |
| Other values (6863) | 1721264 | 56.1% |
| Value | Count | Frequency (%) |
| -900 | 1 | |
| -750 | 1 | |
| -650 | 1 | |
| -600 | 1 | |
| -580 | 1 | |
| -500 | 1 | |
| -497.9 | 1 | |
| -495.1 | 1 | |
| -480 | 1 | |
| -425.8 | 1 |
| Value | Count | Frequency (%) |
| 1160.1 | 1 | < 0.1% |
| 999 | 1 | < 0.1% |
| 900 | 1 | < 0.1% |
| 750 | 1 | < 0.1% |
| 701.6 | 1 | < 0.1% |
| 656.8 | 1 | < 0.1% |
| 655.35 | 1 | < 0.1% |
| 650 | 1 | < 0.1% |
| 625 | 1 | < 0.1% |
| 600 | 3 |
extra
Real number (ℝ)
High correlation, Zeros
| Distinct | 68 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.5378416 |
| Minimum | -7.5 |
|---|---|
| Maximum | 12.5 |
| Zeros | 1240718 |
| Zeros (%) | 40.5% |
| Negative | 12407 |
| Negative (%) | 0.4% |
| Memory size | 23.4 MiB |
Quantile statistics
| Minimum | -7.5 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 1 |
| Q3 | 2.5 |
| 95-th percentile | 5 |
| Maximum | 12.5 |
| Range | 20 |
| Interquartile range (IQR) | 2.5 |
Descriptive statistics
| Standard deviation | 1.7895925 |
|---|---|
| Coefficient of variation (CV) | 1.1637041 |
| Kurtosis | 2.1283702 |
| Mean | 1.5378416 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 1.2686481 |
| Sum | 4716200.2 |
| Variance | 3.2026412 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 1240718 | 40.5% |
| 2.5 | 763716 | 24.9% |
| 1 | 564096 | 18.4% |
| 5 | 209329 | 6.8% |
| 3.5 | 172569 | 5.6% |
| 6 | 23442 | 0.8% |
| 7.5 | 21389 | 0.7% |
| 3.75 | 15389 | 0.5% |
| 8.75 | 12373 | 0.4% |
| 1.25 | 9974 | 0.3% |
| Other values (58) | 33771 | 1.1% |
| Value | Count | Frequency (%) |
| -7.5 | 141 | < 0.1% |
| -6 | 256 | < 0.1% |
| -5 | 859 | < 0.1% |
| -4.5 | 1 | < 0.1% |
| -3.5 | 1 | < 0.1% |
| -2.5 | 3757 | 0.1% |
| -1.25 | 6 | < 0.1% |
| -1 | 7383 | 0.2% |
| -0.5 | 3 | < 0.1% |
| 0 | 1240718 |
| Value | Count | Frequency (%) |
| 12.5 | 2 | < 0.1% |
| 11.25 | 2387 | 0.1% |
| 11 | 2 | < 0.1% |
| 10 | 772 | < 0.1% |
| 9.75 | 3115 | 0.1% |
| 9.45 | 1 | < 0.1% |
| 9.3 | 5 | < 0.1% |
| 9.25 | 5 | < 0.1% |
| 8.97 | 5 | < 0.1% |
| 8.75 | 12373 |
mta_tax
Real number (ℝ)
Skewed
| Distinct | 10 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.48828998 |
| Minimum | -0.5 |
|---|---|
| Maximum | 53.16 |
| Zeros | 23421 |
| Zeros (%) | 0.8% |
| Negative | 24501 |
| Negative (%) | 0.8% |
| Memory size | 23.4 MiB |
Quantile statistics
| Minimum | -0.5 |
|---|---|
| 5-th percentile | 0.5 |
| Q1 | 0.5 |
| median | 0.5 |
| Q3 | 0.5 |
| 95-th percentile | 0.5 |
| Maximum | 53.16 |
| Range | 53.66 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.1034641 |
|---|---|
| Coefficient of variation (CV) | 0.21189069 |
| Kurtosis | 21970.425 |
| Mean | 0.48828998 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 35.315435 |
| Sum | 1497471.1 |
| Variance | 0.01070482 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0.5 | 3018062 | 98.4% |
| -0.5 | 24501 | 0.8% |
| 0 | 23421 | 0.8% |
| 0.8 | 771 | < 0.1% |
| 4 | 4 | < 0.1% |
| 0.3 | 3 | < 0.1% |
| 1.6 | 1 | < 0.1% |
| 53.16 | 1 | < 0.1% |
| 1.09 | 1 | < 0.1% |
| 1.05 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| -0.5 | 24501 | 0.8% |
| 0 | 23421 | 0.8% |
| 0.3 | 3 | < 0.1% |
| 0.5 | 3018062 | |
| 0.8 | 771 | < 0.1% |
| 1.05 | 1 | < 0.1% |
| 1.09 | 1 | < 0.1% |
| 1.6 | 1 | < 0.1% |
| 4 | 4 | < 0.1% |
| 53.16 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 53.16 | 1 | < 0.1% |
| 4 | 4 | < 0.1% |
| 1.6 | 1 | < 0.1% |
| 1.09 | 1 | < 0.1% |
| 1.05 | 1 | < 0.1% |
| 0.8 | 771 | < 0.1% |
| 0.5 | 3018062 | |
| 0.3 | 3 | < 0.1% |
| 0 | 23421 | 0.8% |
| -0.5 | 24501 | 0.8% |
tip_amount
Real number (ℝ)
High correlation, Zeros
| Distinct | 4036 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.3679407 |
| Minimum | -96.22 |
|---|---|
| Maximum | 380.8 |
| Zeros | 694757 |
| Zeros (%) | 22.7% |
| Negative | 225 |
| Negative (%) | < 0.1% |
| Memory size | 23.4 MiB |
Quantile statistics
| Minimum | -96.22 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1 |
| median | 2.72 |
| Q3 | 4.2 |
| 95-th percentile | 11.11 |
| Maximum | 380.8 |
| Range | 477.02 |
| Interquartile range (IQR) | 3.2 |
Descriptive statistics
| Standard deviation | 3.8267595 |
|---|---|
| Coefficient of variation (CV) | 1.1362313 |
| Kurtosis | 92.756546 |
| Mean | 3.3679407 |
| Median Absolute Deviation (MAD) | 1.72 |
| Skewness | 4.2238312 |
| Sum | 10328686 |
| Variance | 14.644088 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 694757 | 22.7% |
| 2 | 152040 | 5.0% |
| 1 | 132857 | 4.3% |
| 3 | 76829 | 2.5% |
| 5 | 42332 | 1.4% |
| 2.8 | 40013 | 1.3% |
| 3.5 | 34750 | 1.1% |
| 1.5 | 34000 | 1.1% |
| 4 | 33148 | 1.1% |
| 2.1 | 32931 | 1.1% |
| Other values (4026) | 1793109 | 58.5% |
| Value | Count | Frequency (%) |
| -96.22 | 1 | < 0.1% |
| -90.09 | 1 | < 0.1% |
| -64.66 | 1 | < 0.1% |
| -51.89 | 1 | < 0.1% |
| -35.03 | 1 | < 0.1% |
| -33.93 | 1 | < 0.1% |
| -20 | 4 | |
| -14.5 | 1 | < 0.1% |
| -13.9 | 1 | < 0.1% |
| -12.81 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 380.8 | 1 | < 0.1% |
| 270.3 | 1 | < 0.1% |
| 222.21 | 1 | < 0.1% |
| 211.5 | 1 | < 0.1% |
| 202 | 1 | < 0.1% |
| 201.65 | 1 | < 0.1% |
| 200.2 | 1 | < 0.1% |
| 200 | 3 | |
| 160 | 1 | < 0.1% |
| 150 | 1 | < 0.1% |
tolls_amount
Real number (ℝ)
High correlation, Zeros
| Distinct | 776 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.51849066 |
| Minimum | -65 |
|---|---|
| Maximum | 196.99 |
| Zeros | 2840307 |
| Zeros (%) | 92.6% |
| Negative | 1377 |
| Negative (%) | < 0.1% |
| Memory size | 23.4 MiB |
Quantile statistics
| Minimum | -65 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 6.55 |
| Maximum | 196.99 |
| Range | 261.99 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 2.017579 |
|---|---|
| Coefficient of variation (CV) | 3.8912543 |
| Kurtosis | 78.69951 |
| Mean | 0.51849066 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 5.3893505 |
| Sum | 1590089.5 |
| Variance | 4.0706251 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 2840307 | 92.6% |
| 6.55 | 207651 | 6.8% |
| 12.75 | 1780 | 0.1% |
| 3 | 1602 | 0.1% |
| 14.75 | 1408 | < 0.1% |
| -6.55 | 1165 | < 0.1% |
| 13.1 | 993 | < 0.1% |
| 11.55 | 868 | < 0.1% |
| 11.75 | 863 | < 0.1% |
| 13.75 | 725 | < 0.1% |
| Other values (766) | 9404 | 0.3% |
| Value | Count | Frequency (%) |
| -65 | 1 | |
| -39.3 | 1 | |
| -34.05 | 1 | |
| -30.3 | 1 | |
| -30.05 | 1 | |
| -30 | 2 | |
| -29.85 | 1 | |
| -29.5 | 1 | |
| -27.5 | 1 | |
| -27.3 | 2 |
| Value | Count | Frequency (%) |
| 196.99 | 1 | |
| 92.75 | 1 | |
| 86.55 | 1 | |
| 81.55 | 1 | |
| 81 | 2 | |
| 78 | 1 | |
| 73.75 | 1 | |
| 70 | 1 | |
| 69.3 | 1 | |
| 65 | 1 |
improvement_surcharge
Categorical
High correlation, Imbalance
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 152.1 MiB |
| 1.0 | 3035371 |
|---|---|
| -1.0 | 25117 |
| 0.3 | 5269 |
| 0.0 | 973 |
| -0.3 | 36 |
Length
| Max length | 4 |
|---|---|
| Median length | 3 |
| Mean length | 3.0082018 |
| Min length | 3 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1.0 |
|---|---|
| 2nd row | 1.0 |
| 3rd row | 1.0 |
| 4th row | 1.0 |
| 5th row | 1.0 |
Common Values
| Value | Count | Frequency (%) |
| 1.0 | 3035371 | 99.0% |
| -1.0 | 25117 | 0.8% |
| 0.3 | 5269 | 0.2% |
| 0.0 | 973 | < 0.1% |
| -0.3 | 36 | < 0.1% |
Common Values (Plot)
| Value | Count | Frequency (%) |
| 1.0 | 3060488 | |
| 0.3 | 5305 | 0.2% |
| 0.0 | 973 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 3067739 | |
| . | 3066766 | |
| 1 | 3060488 | |
| - | 25153 | 0.3% |
| 3 | 5305 | 0.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 6133532 | |
| Other Punctuation | 3066766 | |
| Dash Punctuation | 25153 | 0.3% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 3067739 | |
| 1 | 3060488 | |
| 3 | 5305 | 0.1% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 3066766 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 25153 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 9225451 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 3067739 | |
| . | 3066766 | |
| 1 | 3060488 | |
| - | 25153 | 0.3% |
| 3 | 5305 | 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 9225451 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 3067739 | |
| . | 3066766 | |
| 1 | 3060488 | |
| - | 25153 | 0.3% |
| 3 | 5305 | 0.1% |
total_amount
Real number (ℝ)
High correlation
| Distinct | 15871 |
|---|---|
| Distinct (%) | 0.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 27.020383 |
| Minimum | -751 |
|---|---|
| Maximum | 1169.4 |
| Zeros | 568 |
| Zeros (%) | < 0.1% |
| Negative | 25204 |
| Negative (%) | 0.8% |
| Memory size | 23.4 MiB |
Quantile statistics
| Minimum | -751 |
|---|---|
| 5-th percentile | 10.92 |
| Q1 | 15.4 |
| median | 20.16 |
| Q3 | 28.7 |
| 95-th percentile | 80.25 |
| Maximum | 1169.4 |
| Range | 1920.4 |
| Interquartile range (IQR) | 13.3 |
Descriptive statistics
| Standard deviation | 22.163589 |
|---|---|
| Coefficient of variation (CV) | 0.82025443 |
| Kurtosis | 26.595788 |
| Mean | 27.020383 |
| Median Absolute Deviation (MAD) | 5.78 |
| Skewness | 2.8328288 |
| Sum | 82865192 |
| Variance | 491.22468 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 16.8 | 48536 | 1.6% |
| 12.6 | 45398 | 1.5% |
| 21 | 37923 | 1.2% |
| 15.12 | 26389 | 0.9% |
| 15.96 | 26375 | 0.9% |
| 14.28 | 24988 | 0.8% |
| 18.48 | 24939 | 0.8% |
| 17.64 | 24786 | 0.8% |
| 14 | 24305 | 0.8% |
| 19.32 | 24192 | 0.8% |
| Other values (15861) | 2758935 | 90.0% |
| Value | Count | Frequency (%) |
| -751 | 1 | |
| -630.7 | 1 | |
| -603.5 | 1 | |
| -601 | 1 | |
| -583.5 | 1 | |
| -520.55 | 1 | |
| -497.85 | 1 | |
| -481 | 1 | |
| -430.8 | 1 | |
| -401 | 1 |
| Value | Count | Frequency (%) |
| 1169.4 | 1 | |
| 1000 | 1 | |
| 901 | 1 | |
| 751 | 1 | |
| 705.6 | 1 | |
| 667.1 | 1 | |
| 656.85 | 1 | |
| 651 | 1 | |
| 626 | 1 | |
| 614.45 | 1 |
congestion_surcharge
Categorical
High correlation, Imbalance, Missing
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 71743 |
| Missing (%) | 2.3% |
| Memory size | 152.4 MiB |
| 2.5 | 2744268 |
|---|---|
| 0.0 | 231037 |
| -2.5 | 19718 |
Length
| Max length | 4 |
|---|---|
| Median length | 3 |
| Mean length | 3.0065836 |
| Min length | 3 |
Unique
| Unique | 0 ? |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 2.5 |
|---|---|
| 2nd row | 2.5 |
| 3rd row | 2.5 |
| 4th row | 0.0 |
| 5th row | 2.5 |
Common Values
| Value | Count | Frequency (%) |
| 2.5 | 2744268 | 89.5% |
| 0.0 | 231037 | 7.5% |
| -2.5 | 19718 | 0.6% |
| (Missing) | 71743 | 2.3% |
Common Values (Plot)
| Value | Count | Frequency (%) |
| 2.5 | 2763986 | |
| 0.0 | 231037 | 7.7% |
Most occurring characters
| Value | Count | Frequency (%) |
| . | 2995023 | |
| 2 | 2763986 | |
| 5 | 2763986 | |
| 0 | 462074 | 5.1% |
| - | 19718 | 0.2% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 5990046 | |
| Other Punctuation | 2995023 | |
| Dash Punctuation | 19718 | 0.2% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 2 | 2763986 | |
| 5 | 2763986 | |
| 0 | 462074 | 7.7% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 2995023 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 19718 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 9004787 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| . | 2995023 | |
| 2 | 2763986 | |
| 5 | 2763986 | |
| 0 | 462074 | 5.1% |
| - | 19718 | 0.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 9004787 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| . | 2995023 | |
| 2 | 2763986 | |
| 5 | 2763986 | |
| 0 | 462074 | 5.1% |
| - | 19718 | 0.2% |
airport_fee
Categorical
Imbalance, Missing
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 71743 |
| Missing (%) | 2.3% |
| Memory size | 152.6 MiB |
| 0.0 | 2730456 |
|---|---|
| 1.25 | 260960 |
| -1.25 | 3607 |
Length
| Max length | 5 |
|---|---|
| Median length | 3 |
| Mean length | 3.0895399 |
| Min length | 3 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0.0 |
|---|---|
| 2nd row | 0.0 |
| 3rd row | 0.0 |
| 4th row | 1.25 |
| 5th row | 0.0 |
Common Values
| Value | Count | Frequency (%) |
| 0.0 | 2730456 | 89.0% |
| 1.25 | 260960 | 8.5% |
| -1.25 | 3607 | 0.1% |
| (Missing) | 71743 | 2.3% |
Common Values (Plot)
| Value | Count | Frequency (%) |
| 0.0 | 2730456 | |
| 1.25 | 264567 | 8.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 5460912 | |
| . | 2995023 | |
| 1 | 264567 | 2.9% |
| 2 | 264567 | 2.9% |
| 5 | 264567 | 2.9% |
| - | 3607 | < 0.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 6254613 | |
| Other Punctuation | 2995023 | |
| Dash Punctuation | 3607 | < 0.1% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 5460912 | |
| 1 | 264567 | 4.2% |
| 2 | 264567 | 4.2% |
| 5 | 264567 | 4.2% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 2995023 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 3607 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 9253243 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 5460912 | |
| . | 2995023 | |
| 1 | 264567 | 2.9% |
| 2 | 264567 | 2.9% |
| 5 | 264567 | 2.9% |
| - | 3607 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 9253243 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 5460912 | |
| . | 2995023 | |
| 1 | 264567 | 2.9% |
| 2 | 264567 | 2.9% |
| 5 | 264567 | 2.9% |
| - | 3607 | < 0.1% |
Interactions
Correlations
| | DOLocationID | PULocationID | RatecodeID | VendorID | airport_fee | congestion_surcharge | extra | fare_amount | improvement_surcharge | mta_tax | passenger_count | payment_type | store_and_fwd_flag | tip_amount | tolls_amount | total_amount | trip_distance |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| DOLocationID | 1.000 | 0.087 | -0.045 | 0.011 | 0.078 | 0.147 | -0.007 | -0.109 | 0.013 | 0.026 | -0.007 | 0.031 | 0.009 | -0.016 | -0.056 | -0.097 | -0.108 |
| PULocationID | 0.087 | 1.000 | -0.120 | 0.025 | 0.374 | 0.216 | -0.027 | -0.137 | 0.011 | 0.011 | -0.015 | 0.035 | 0.005 | -0.038 | -0.122 | -0.128 | -0.141 |
| RatecodeID | -0.045 | -0.120 | 1.000 | 0.110 | 0.021 | 0.229 | -0.089 | 0.349 | 0.013 | -0.259 | 0.050 | 0.033 | 0.004 | 0.132 | 0.531 | 0.339 | 0.267 |
| VendorID | 0.011 | 0.025 | 0.110 | 1.000 | 0.043 | 0.050 | 0.582 | 0.008 | 0.058 | 0.000 | 0.234 | 0.060 | 0.102 | 0.004 | 0.007 | 0.040 | 0.000 |
| airport_fee | 0.078 | 0.374 | 0.021 | 0.043 | 1.000 | 0.336 | 0.426 | 0.111 | 0.268 | 0.000 | 0.024 | 0.136 | 0.006 | 0.017 | 0.060 | 0.166 | 0.000 |
| congestion_surcharge | 0.147 | 0.216 | 0.229 | 0.050 | 0.336 | 1.000 | 0.286 | 0.077 | 0.629 | 0.000 | 0.016 | 0.366 | 0.007 | 0.029 | 0.111 | 0.099 | 0.000 |
| extra | -0.007 | -0.027 | -0.089 | 0.582 | 0.426 | 0.286 | 1.000 | 0.059 | 0.227 | 0.139 | -0.045 | 0.147 | 0.060 | 0.111 | 0.139 | 0.148 | 0.098 |
| fare_amount | -0.109 | -0.137 | 0.349 | 0.008 | 0.111 | 0.077 | 0.059 | 1.000 | 0.059 | 0.038 | 0.040 | 0.033 | 0.000 | 0.452 | 0.425 | 0.966 | 0.900 |
| improvement_surcharge | 0.013 | 0.011 | 0.013 | 0.058 | 0.268 | 0.629 | 0.227 | 0.059 | 1.000 | 0.032 | 0.011 | 0.276 | 0.023 | 0.035 | 0.070 | 0.066 | 0.009 |
| mta_tax | 0.026 | 0.011 | -0.259 | 0.000 | 0.000 | 0.000 | 0.139 | 0.038 | 0.032 | 1.000 | -0.016 | 0.005 | 0.000 | 0.065 | -0.056 | 0.042 | 0.024 |
| passenger_count | -0.007 | -0.015 | 0.050 | 0.234 | 0.024 | 0.016 | -0.045 | 0.040 | 0.011 | -0.016 | 1.000 | 0.031 | 0.030 | 0.007 | 0.039 | 0.038 | 0.037 |
| payment_type | 0.031 | 0.035 | 0.033 | 0.060 | 0.136 | 0.366 | 0.147 | 0.033 | 0.276 | 0.005 | 0.031 | 1.000 | 0.012 | 0.031 | 0.019 | 0.103 | 0.005 |
| store_and_fwd_flag | 0.009 | 0.005 | 0.004 | 0.102 | 0.006 | 0.007 | 0.060 | 0.000 | 0.023 | 0.000 | 0.030 | 0.012 | 1.000 | 0.000 | 0.007 | 0.004 | 0.000 |
| tip_amount | -0.016 | -0.038 | 0.132 | 0.004 | 0.017 | 0.029 | 0.111 | 0.452 | 0.035 | 0.065 | 0.007 | 0.031 | 0.000 | 1.000 | 0.249 | 0.592 | 0.431 |
| tolls_amount | -0.056 | -0.122 | 0.531 | 0.007 | 0.060 | 0.111 | 0.139 | 0.425 | 0.070 | -0.056 | 0.039 | 0.019 | 0.007 | 0.249 | 1.000 | 0.436 | 0.407 |
| total_amount | -0.097 | -0.128 | 0.339 | 0.040 | 0.166 | 0.099 | 0.148 | 0.966 | 0.066 | 0.042 | 0.038 | 0.103 | 0.004 | 0.592 | 0.436 | 1.000 | 0.876 |
| trip_distance | -0.108 | -0.141 | 0.267 | 0.000 | 0.000 | 0.000 | 0.098 | 0.900 | 0.009 | 0.024 | 0.037 | 0.005 | 0.000 | 0.431 | 0.407 | 0.876 | 1.000 |
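
A matrix of this size is easier to read as a ranked list of strong pairs. The following is a minimal sketch, not part of the report itself: it copies four fields from the matrix above and lists the pairs whose absolute coefficient reaches a cutoff. The names `CORR` and `strong_pairs` are hypothetical, and the 0.5 cutoff is an assumption chosen because it reproduces the high-correlation alerts listed for these fields in the overview.

```python
# Pairwise coefficients copied from the correlation table above
# (only the four fare-related fields, to keep the sketch short).
CORR = {
    ("fare_amount", "tip_amount"): 0.452,
    ("fare_amount", "total_amount"): 0.966,
    ("fare_amount", "trip_distance"): 0.900,
    ("tip_amount", "total_amount"): 0.592,
    ("tip_amount", "trip_distance"): 0.431,
    ("total_amount", "trip_distance"): 0.876,
}


def strong_pairs(corr, threshold=0.5):
    """Return (a, b, value) for pairs with |value| >= threshold, strongest first.

    The default threshold is an assumption, not a value stated in the report.
    """
    hits = [(a, b, v) for (a, b), v in corr.items() if abs(v) >= threshold]
    return sorted(hits, key=lambda item: abs(item[2]), reverse=True)


for a, b, value in strong_pairs(CORR):
    print(f"{a} ~ {b}: {value:.3f}")
```

With these inputs, `tip_amount` is flagged only against `total_amount` (0.592), while its coefficients with `fare_amount` (0.452) and `trip_distance` (0.431) fall below the assumed cutoff.
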
Missing values
Sample
| | VendorID | tpep_pickup_datetime | tpep_dropoff_datetime | passenger_count | trip_distance | RatecodeID | store_and_fwd_flag | PULocationID | DOLocationID | payment_type | fare_amount | extra | mta_tax | tip_amount | tolls_amount | improvement_surcharge | total_amount | congestion_surcharge | airport_fee |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2 | 2023-01-01 00:32:10 | 2023-01-01 00:40:36 | 1.0 | 0.97 | 1.0 | N | 161 | 141 | 2 | 9.3 | 1.00 | 0.5 | 0.00 | 0.0 | 1.0 | 14.30 | 2.5 | 0.00 |
| 1 | 2 | 2023-01-01 00:55:08 | 2023-01-01 01:01:27 | 1.0 | 1.10 | 1.0 | N | 43 | 237 | 1 | 7.9 | 1.00 | 0.5 | 4.00 | 0.0 | 1.0 | 16.90 | 2.5 | 0.00 |
| 2 | 2 | 2023-01-01 00:25:04 | 2023-01-01 00:37:49 | 1.0 | 2.51 | 1.0 | N | 48 | 238 | 1 | 14.9 | 1.00 | 0.5 | 15.00 | 0.0 | 1.0 | 34.90 | 2.5 | 0.00 |
| 3 | 1 | 2023-01-01 00:03:48 | 2023-01-01 00:13:25 | 0.0 | 1.90 | 1.0 | N | 138 | 7 | 1 | 12.1 | 7.25 | 0.5 | 0.00 | 0.0 | 1.0 | 20.85 | 0.0 | 1.25 |
| 4 | 2 | 2023-01-01 00:10:29 | 2023-01-01 00:21:19 | 1.0 | 1.43 | 1.0 | N | 107 | 79 | 1 | 11.4 | 1.00 | 0.5 | 3.28 | 0.0 | 1.0 | 19.68 | 2.5 | 0.00 |
| 5 | 2 | 2023-01-01 00:50:34 | 2023-01-01 01:02:52 | 1.0 | 1.84 | 1.0 | N | 161 | 137 | 1 | 12.8 | 1.00 | 0.5 | 10.00 | 0.0 | 1.0 | 27.80 | 2.5 | 0.00 |
| 6 | 2 | 2023-01-01 00:09:22 | 2023-01-01 00:19:49 | 1.0 | 1.66 | 1.0 | N | 239 | 143 | 1 | 12.1 | 1.00 | 0.5 | 3.42 | 0.0 | 1.0 | 20.52 | 2.5 | 0.00 |
| 7 | 2 | 2023-01-01 00:27:12 | 2023-01-01 00:49:56 | 1.0 | 11.70 | 1.0 | N | 142 | 200 | 1 | 45.7 | 1.00 | 0.5 | 10.74 | 3.0 | 1.0 | 64.44 | 2.5 | 0.00 |
| 8 | 2 | 2023-01-01 00:21:44 | 2023-01-01 00:36:40 | 1.0 | 2.95 | 1.0 | N | 164 | 236 | 1 | 17.7 | 1.00 | 0.5 | 5.68 | 0.0 | 1.0 | 28.38 | 2.5 | 0.00 |
| 9 | 2 | 2023-01-01 00:39:42 | 2023-01-01 00:50:36 | 1.0 | 3.01 | 1.0 | N | 141 | 107 | 2 | 14.9 | 1.00 | 0.5 | 0.00 | 0.0 | 1.0 | 19.90 | 2.5 | 0.00 |
| | VendorID | tpep_pickup_datetime | tpep_dropoff_datetime | passenger_count | trip_distance | RatecodeID | store_and_fwd_flag | PULocationID | DOLocationID | payment_type | fare_amount | extra | mta_tax | tip_amount | tolls_amount | improvement_surcharge | total_amount | congestion_surcharge | airport_fee |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 3066756 | 1 | 2023-01-31 23:05:36 | 2023-01-31 23:20:37 | NaN | 0.00 | NaN | None | 161 | 148 | 0 | 12.74 | 0.0 | 0.5 | 0.00 | 0.0 | 1.0 | 16.74 | NaN | NaN |
| 3066757 | 2 | 2023-01-31 23:08:54 | 2023-01-31 23:32:23 | NaN | 9.44 | NaN | None | 231 | 83 | 0 | 33.08 | 0.0 | 0.5 | 5.56 | 0.0 | 1.0 | 42.64 | NaN | NaN |
| 3066758 | 1 | 2023-01-31 23:10:56 | 2023-01-31 23:23:37 | NaN | 0.00 | NaN | None | 162 | 151 | 0 | 12.00 | 1.0 | 0.5 | 9.40 | 0.0 | 1.0 | 28.40 | NaN | NaN |
| 3066759 | 1 | 2023-01-31 23:54:02 | 2023-02-01 00:23:17 | NaN | 0.00 | NaN | None | 68 | 160 | 0 | 27.00 | 1.0 | 0.5 | 10.55 | 0.0 | 1.0 | 44.55 | NaN | NaN |
| 3066760 | 2 | 2023-01-31 23:30:20 | 2023-01-31 23:34:38 | NaN | 0.82 | NaN | None | 231 | 144 | 0 | 15.21 | 0.0 | 0.5 | 3.84 | 0.0 | 1.0 | 23.05 | NaN | NaN |
| 3066761 | 2 | 2023-01-31 23:58:34 | 2023-02-01 00:12:33 | NaN | 3.05 | NaN | None | 107 | 48 | 0 | 15.80 | 0.0 | 0.5 | 3.96 | 0.0 | 1.0 | 23.76 | NaN | NaN |
| 3066762 | 2 | 2023-01-31 23:31:09 | 2023-01-31 23:50:36 | NaN | 5.80 | NaN | None | 112 | 75 | 0 | 22.43 | 0.0 | 0.5 | 2.64 | 0.0 | 1.0 | 29.07 | NaN | NaN |
| 3066763 | 2 | 2023-01-31 23:01:05 | 2023-01-31 23:25:36 | NaN | 4.67 | NaN | None | 114 | 239 | 0 | 17.61 | 0.0 | 0.5 | 5.32 | 0.0 | 1.0 | 26.93 | NaN | NaN |
| 3066764 | 2 | 2023-01-31 23:40:00 | 2023-01-31 23:53:00 | NaN | 3.15 | NaN | None | 230 | 79 | 0 | 18.15 | 0.0 | 0.5 | 4.43 | 0.0 | 1.0 | 26.58 | NaN | NaN |
| 3066765 | 2 | 2023-01-31 23:07:32 | 2023-01-31 23:21:56 | NaN | 2.85 | NaN | None | 262 | 143 | 0 | 15.97 | 0.0 | 0.5 | 2.00 | 0.0 | 1.0 | 21.97 | NaN | NaN |
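
In the last rows shown above, the same five fields are empty together (`passenger_count`, `RatecodeID`, `store_and_fwd_flag`, `congestion_surcharge`, `airport_fee`), and each of those rows has `payment_type` 0. The sketch below illustrates one way to check whether records are missing these fields on an all-or-nothing basis. It uses two records copied from the sample (rows 0 and 3066765), reduced to the relevant fields. The helper names `is_missing` and `missing_pattern` are hypothetical, and two rows cannot confirm the pattern for the full dataset.

```python
import math

# Fields that are empty in the last rows of the sample.
SPARSE_FIELDS = [
    "passenger_count",
    "RatecodeID",
    "store_and_fwd_flag",
    "congestion_surcharge",
    "airport_fee",
]


def is_missing(value):
    """Treat None and float NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def missing_pattern(row, fields=SPARSE_FIELDS):
    """Return the names of the fields that are missing in one record."""
    return tuple(name for name in fields if is_missing(row.get(name)))


# Row 0 and row 3066765 from the sample tables, relevant fields only.
rows = [
    {"passenger_count": 1.0, "RatecodeID": 1.0, "store_and_fwd_flag": "N",
     "congestion_surcharge": 2.5, "airport_fee": 0.0, "payment_type": 2},
    {"passenger_count": float("nan"), "RatecodeID": float("nan"),
     "store_and_fwd_flag": None, "congestion_surcharge": float("nan"),
     "airport_fee": float("nan"), "payment_type": 0},
]

patterns = [missing_pattern(row) for row in rows]

# True when every record is missing either none or all of the sparse fields.
all_or_nothing = all(len(p) in (0, len(SPARSE_FIELDS)) for p in patterns)
print(all_or_nothing)
```

Run against the full dataset, the same check would show whether the 71743 missing values reported per field in the overview fall in the same 71743 records.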